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BINARY STATE MACHINE SYSTEM AND METHOD FOR REGEX 
PROCESSING OF A DATA STREAM IN AN INTRUSION DETECTION 
SYSTEM 



TECHNICAL FIELD OF THE INVENTION 

The present invention relates in general to computer 
network intrusion detection systems and, more 
particularly, to a binary state machine system and method 
5 for REGEX processing of a data stream in an intrusion 
detection system. 



10 
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BACKGROUND OF THE INVENTION 

Security is a major concern among operators of 
computer networks. Attacks upon computer networks can 
disrupt the service of the computer network and can 
5 potentially expose proprietary data to such persons 
committing attacks. 

Network security products such as intrusion 
detection systems (ID systems) can use a passive 
filtering technique to detect policy violations and/or 

10 patterns of misuse that indicate an attack is occurring 
or is imminent. The passive filtering technique usually 
comprises monitoring traffic upon or outside of the 
computer network for packets of data and comparing these 
packets of data with known "attack signatures." 

15 Some conventional ID systems use a string matching 

algorithm as a passive filtering technique. A string- 
matching algorithm takes a string of characters from the 
input stream and compares the string of characters to 
known attack signatures. For example, some conventional 

2 0 ID systems can use a UNIX-based regular expression 

(REGEX) engine to search through an input stream for 
character strings that match an attack signature. In 
such a system, first the input stream is buffered. The 
buffer is compared to the REGEX command which searches 
25 through the entire buffer in order to find a match. If a 
match is found, indicating an attack, an alarm is sounded 
or preventative measures are taken. 

There are several disadvantages with such a 
conventional ID system. For example, excessive memory 

3 0 and CPU resources are required during buffer 

manipulation. Furthermore, it may be necessary to 
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examine individual characters in the data buffer several 
times in order to find a valid match. Furthermore, as 
data traffic increases such problems lead to the 
possibility of dropping packets, resulting in a potential 
5 failure to detect an attack. 

An additional complicating factor is that for some 
attacks, there can be a large amount of irrelevant data 
between strings of relevant data. As result, these 
conventional systems can require the buffering of large 
10 portions of the input stream, and it can be necessary to 
search portions these buffered portions multiple times. 
Furthermore, it can be possible for an attack to cross a 
buffer boundary, leading to the possibility of missing 
the attack. 

15 
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STJMMARY OF THE INVENTION 

In accordance with the present invention, a binary 
state machine system and method are disclosed that 
provide significant advantages over prior developed 
5 string searching algorithms for intrusion detection 
systems . 

According to one aspect of the present invention, a 
method for using a binary state machine for processing a 
data stream in an intrusion detection system comprises 

10 maintaining a state table. The state table is indexed 

such that inputs comprising a current state and a current 
character yield an output of a new state. The new state 
is related to an indication of an attack on a computer 
network. The method further includes maintaing a current 

15 state. An input stream comprising a plurality of 

characters is received. A first character of the input 
stream is selected as the current character. The current 
character and the current state are compared to the state 
table to generate a new state. 

2 0 According to another aspect of the present 

invention, a system for use as a binary state machine for 
processing a data stream in an intrusion detection system 
comprises a state table. The state table is indexed such 
that inputs comprising a current state and a current 

25 character yield an output of a new state. The new state 
is related to an attack on a computer network. The 
system further comprises a state machine communicatively 
coupled to the state table. The state machine is 
operable to maintain the current state and receive an 

30 input stream comprising a plurality of characters. The 
state machine is further operable to select a first 
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character of the input stream as the current character 
and compare the current character and the current state 
to the state table to generate a new state. 

It is a technical advantage of the present invention 
5 that each character in the input stream need be examined 
only once. This advantage reduces the need for excessive 
buffering of data, and preserves CPU and memory resources 
of an ID system incorporating the invention. 

It is another technical advantage that the present 
10 invention allows for more efficient searching of the 

input stream. This reduces the possibility of dropping 
packets, and increases the probability of detecting 
attacks . 

It is a further technical advantage that it accounts 
15 for irrelevant data between strings of relevant data in 
an input stream without the need for excessive buffering 
of data. This eliminates the problem in conventional 
systems that occurs when an attack crosses the buffer 
boundary . 

2 0 Other technical advantages should be apparent to one 

of ordinary skill in the art in view of the 
specification, claims, and drawings. 
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RT?TF,F DESCRIPTION OF THE DRAWINGS 

A more complete understanding of the present 
invention and advantages thereof may be acquired by- 
referring to the following description taken in 
5 conjunction with the accompanying drawings, in which like 
reference numbers indicate like features, and wherein: 
FIGURE 1 is a block diagram of a computer network 
including an intrusion detection system having one 
embodiment of a state machine for processing data 
10 streams; 

FIGURE 2 is a flow diagram of one embodiment of a 
method for processing data streams using a binary state 
machine; and 

FIGURES 3A through 3C are block diagrams showing the 
15 operation of one embodiment of a binary state machine for 
an example data stream. 
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DETAILED DESCRIPTION OF THE INVENTION 

FIGURE 1 is a block diagram of a computer network 
including an intrusion detection system having one 
embodiment of a state machine for processing data 
5 streams. A computer network, indicated generally at 10, 
includes a protected network 12 and an unprotected 
network 14. Protected network 12 is a "internal" 
network, meaning that only authorized users are permitted 
to pass data between systems coupled within protected 

10 network 12 . 

Protected network 12 includes several network 
devices coupled to a network medium 22 . In the 
embodiment of FIGURE 1, the network devices include a 
workstation 20, a server 23, a router 25, and a printer 

15 27. Network 22 can include any network medium and 

protocol thereon, such as Ethernet. Firewall 18 couples 
to network medium 22 and separates protected network 12 
from an unprotected network 14. Firewall 18 is intended 
to operate as a gate keeper, preventing unauthorized 

2 0 users on unprotected network 14 to interfere with the 

operation of protected network 12 . Firewall 18 further 
allows authorized users on unprotected network 14 to 
access protected network 12. In the example of FIGURE 1, 
unprotected network 14 includes network medium 2 8 coupled 

2 5 to the Internet 26. 

An intrusion detection system 3 0 couples to network 
medium 22, network medium 28, or firewall 18. Intrusion 
detection system 3 0 may comprise, for example, software 
code executing on a computing platform separate from 

30 other network devices. Alternatively, intrusion 

detection system 30 can include functionality integrated 
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within firewall 18 or some other network device on 
protected network 12 . 

Intrusion detection system 3 0 has access to network 
traffic on computer medium 28 and/or computer medium 22. 
5 As shown, intrusion detection system 3 0 can couple to 
network medium 2 8 on unprotected network 14. 
Alternatively, as further shown in FIGURE 1, intrusion 
detection system 3 0 can couple to firewall 18, wherein 
firewall 18 passes network traffic to intrusion detection 
10 system. 

Intrusion detection system 30 includes state machine 
32 communicatively coupled to state tables 36. Intrusion 
detection system 3 0 and state machine 32 can comprise, 
for example, executable code stored on a computer 

15 readable medium and executable by a processor 

communicatively coupled to the computer readable medium. 
Alternatively, such components can be implemented in 
firmware or an ASIC implementation. 

In operation, network traffic travels on network 

20 medium 28. The network traffic accessing firewall 18 and 
attempting to access firewall 18 form an input stream 29 
to intrusion detection system 30. For example, input 
stream 2 9 may include data intended, for example, to one 
of the network devices within protected network 12 . 

2 5 However, input stream 2 9 may also include some type of 
attack upon protected network 12 . 

Intrusion detection system 3 0 can recognize if input 
stream 2 9 includes an attack signature, meaning data 
within input stream 29 indicates an attack is occurring 

30 on protected network 12. Intrusion detection system 
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detects the hostile nature of data within input stream 2 9 
by practicing the present invention. 

Intrusion detection system 30 access input stream 29 
and communicates input stream 29 to state machine 32. 
5 State machine 32 maintains a current state 33. State 
machine 32 further selects a character 38 from input 
stream 2 9 as a current character. State machine 3 2 then 
compares current state 3 3 and current character 3 8 to 
state tables 36 to determine a new state 35. 

10 State tables 3 6 are indexed such that inputs 

comprising the current state and a current character 
yield an output of a new state. State tables 3 6 are 
formed such that the new state can be related to an 
indication of an attack on a computer network, as 

15 explained with respect to later FIGURES. 

By using such an implementation, intrusion detection 
system 3 0 improves the efficiency of detection of attack 
signatures over conventional intrusion detection systems. 
For example, each character in the input stream is 

20 processed by intrusion detection system 30 only once. 
After the new state based upon that character and the 
current state is determined, there is no longer a need to 
buffer that character. As such, the implementation of 
FIGURE 1 eliminates the drain upon processor and memory 

2 5 resources taken up by buffer manipulation. 

FIGURE 2 is a flow diagram of one embodiment of a 
method for processing data streams using a binary state 
machine. The method starts at step 50. At step 54, a 
REGEX attack signature is defined. At step 58, a state 

3 0 table is created corresponding to that REGEX attack 

signature. The objective of step 58 is to create a state 
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table indexed such that if a current state is known and a 
current character is known a new state can be generated. 
The REGEX command can include, for example, a character 
string or combination of character strings that indicate 
5 a particular attack is occurring upon a computer network. 
There are many such strings and many such attacks known 
in the art . 

At step 62, the system is initialized and the 
"current state" is set. Such a step may be necessary, 

10 for example, before the input stream is detected. At 
step 66, the data stream is received. At step 68, the 
"current character" is selected as a character of the 
data stream received in step 66. For example, the first 
such character in the data stream can be selected as the 

15 first ""^current character." 

At step 70, the "current state" and "current 
character" of the data stream are compared to the state 
table in order to generate a "new state." At step 74, 
the system checks to see if the "new state" is equal to 

2 0 an state indicating an attack is occurring. Most, if not 
all, attacks will have a signature comprise more than one 
character. Therefore, the state table to detect such an 
attack will include more than one state. If an attack is 
detected, at step 78 an alarm is generated or a response 

25 is created. For example, such an alarm may be an 

indication transmitted to an operator on the network. A 
response could include the implementation of a 
counterraeasure--for example resetting a connection. If 
the new state does indicate an attack at step 74, the 

30 method continues to step 82. 
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At step 82, it is decided whether or not to continue 
the method. If not, the process ends at step 90. If the 
method continues, at step 86 the "current state" is set 
to equal the "new state." Then, at step 88, the "current 
5 character" is set to equal the next character in the data 
stream. The method then continues to step 70 to repeat 
the comparing step. 

As can be seen in the method of FIGURE 2 , each 
character in the data stream is input into the state 

10 table along with the current state to create a new state. 
This new state is then used with the next character in 
the data stream to create another new state . 

The method further shows how an intrusion detection 
system implementing such a system can attain many 

15 advantages. The need for extensive data buffering is 

eliminated, because each character need only be examined 
a once, and compared to the state table once. Such a 
system implementing the method of FIGURE 2 would improve 
efficiency as it would require fewer processing and 

20 memory resources. As such, an intrusion detection system 
employing the method of FIGURE 2 will have fewer 
instances of dropped packets or missed signatures as 
compared to conventional intrusion detection systems. 

FIGURES 3 A through 3C are block diagrams showing the 

25 operation of one embodiment of a binary state machine for 
an example data stream. 

FIGURE 3A shows an example of an input stream 100. 
In the example of FIGURE 3A, input stream 10 0 includes 
the characters: "the dog ran home." FIGURE 3A further 

3 0 shows the REGEX command 104 of an attack signature. In 
the example of FIGURE 3A, the REGEX command is 
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"dog. *home. " This REGEX command indicates that an attack 
is indicated if: (1) the characters "d" , "o" , "g" are 
observed in order; (2) any number of characters occur 
after the "g"; and then (3) the characters "h" , "o" , "m" , 
5 and "e" occur in order. As can be seen, with this attack 
signature, a properly executing intrusion detection 
system will recognize that input stream 100 indicates an 
attack is occurring on the computer network. 

FIGURE 3B shows one embodiment of a state table 

10 formed from the REGEX command 104 of FIGURE 3A. State 
table 112 includes a state index 129 and a character 
index 120. State index 129 corresponds to the "current 
state" as referenced in the method of FIGURE 2 . 
Character index 12 0 corresponds to the "current 

15 character" as referenced in the method of FIGURE 2 . In 

the embodiment of FIGURE 2, character index 120 comprises 
ASCII codes. 

State table 112 is indexed such that inputs 
comprising a current state and a current character yield 

20 an output of a new state. The new state is related to an 
indication of an attack on a computer network. In the 
embodiment of FIGURE 2, state number 8 is the state at 
which the "dog.*home" attack signature is detected. 
In operation, given a current state 129 and a 

25 current character 120, table 112 generates a new state. 
For example, if the current state 124 is "1" and the 
current character 120 is "O" , represented by ASCII code 
"79," the new state is "2." The state of "2" does not 
yet indicate an attack is occurring. Next, the current 

3 0 state 124 is set to "2" and the current character 12 0 is 
set to the next character in the data stream. As seen by 
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state table 112, if the next character in the data stream 
is any character except for '"G" , the new state will be 
"0." This indicates that the attack this state table is 
designed to detect requires a "G" to immediately succeed 
5 an "O." On the other hand, if the next character after 
the "O" is a "G" , it can be seen that state table 112 
will generate a new state of "3." 

FIGURE 3C shows the operation of the state machine 
of the current invention using the input stream 100 of 

10 FIGURE 3A, the REGEX command 104 of FIGURE 3A, and the 

state table 112 of FIGURE 3B. For example, in the first 
block, current state 125 is "0", and current character 
127 ''T" , which is the first character in data stream 100. 
A new state 12 9 is generated by these inputs indexing 

15 state table 112. New state 129 with these inputs is "0" . 
FIGURE 3C shows the current state 125, current character 
12 7, and new state 12 9 of each character of data stream 
100 in FIGURE 3A. 

The present invention is described with respect to 

20 ASCII character sets. However, the present invention 
contemplates that an intrusion detection system 
implementing the method of the invention can be capable 
of interpreting character sets other than the ASCII 
character sets. Additionally, the present invention 

2 5 contemplates that many state tables can be integrated 

into a single intrusion detection system, such that many 
different attack signatures can be detected. 

Although the present invention has been described in 
detail, it should be understood that various changes, 

3 0 substitutions and alterations can be made thereto without 
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departing from the spirit and scope of the invention as 
defined by the appended claims. 
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WHAT IS CLAIMED IS: 

1 . A method for using a binary state machine for 
processing a data stream in an intrusion detection 
system, the method comprising: 

maintaining a state table, the state table 
indexed such that inputs comprising a current state and a 
current character yield an output of a new state, the new 
state related to an indication of an attack on a computer 
network; 

maintaining the current state; 

receiving an input stream, the input stream 
comprising a plurality of characters; 

selecting a first character of the input stream 
as the current character; and 

comparing a current character and the current 
state to the state table to generate a new state. 

2. The method of Claim 1, further comprising 
initializing the current state to an initial state. 

3. The method of Claim 1, further comprising: 
setting the current state equal to the new 

state; 

selecting a next character as the current 
character, the next character appearing subsequent to the 
first character in the input stream; and 

repeating the comparing step. 

4. The method of Claim 1, further comprising 
recognizing the new state as indicative of an attack upon 
the computer network. 
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5. The method of Claim 5, further comprising 
sounding an alarm. 

6. The method of Claim 1, further comprising 
generating the state table from a REGEX command. 
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7 . A system for use as a binary state machine for 
processing a data stream in an intrusion detection 
system, the system comprising: 

a state table indexed such that inputs 
comprising a current state and a current character yield 
an output of a new state, the new state related to an 
attack on a computer network; and 

a state machine communicatively coupled to the 
state table, the state machine operable to: 

maintain the current state; 

receive an input stream, the input stream 
comprising a plurality of characters; 

select a first character of the input 
stream as the current character; and 

compare the current character and the 
current state to the state table to generate a new state. 

8 . The system of Claim 7 further comprising a 
computer readable medium, wherein the state table is 
stored upon the computer readable medium. 

9. The system of Claim 8, wherein the state 
machine comprises software code stored upon the computer 
readable medium, the software code further operable to be 
executed by a computer processor. 

10. The system of Claim 7, wherein the state 
machine is further operable to initialize the current 
state to an initial state. 
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11. The system of Claim 7, wherein the state 
machine is further operable to: 

setting the current state equal to the new 

state; 

5 selecting a next character as the current 

character, the next character appearing subsequent to the 
first character in the input stream; and 
repeating the comparing step. 

10 12. The system of Claim 7, wherein the state 

machine is further operable to recognizing the new state 
as indicative of an attack upon the computer network. 
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13 . A system for use as an intrusion detection 
system, the system comprising: 

a computer readable medium; 

a network interface for receiving an input 
5 stream comprising a plurality of characters; 

a processor communicatively coupled to the 
computer readable medium and the network interface; 

a state table stored upon the computer readable 
medium, the state table indexed such that inputs 
10 comprising a current state and a current character yield 
an output of a new state, the new state related to an 
attack on a computer network; and 

a state machine comprising instructions stored 
upon the computer readable medium and executable by the 
15 processor, the state machine communicatively coupled to 
the state table, the state machine operable to: 
maintain the current state; 
select a first character of the input 
stream as the current character; and 
2 0 compare the current character and the 

current state to the state table to generate a new state. 
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BINARY STATE MACHINE SYSTEM AND METHOD FOR REGEX 
PROCESSING OF A DATA STREAM IN AN INTRUSION DETECTION 
SYSTEM 

ABSTRACT OF THE DISCLOSURE 

A binary state machine system and method for REGEX 
processing of a data stream in an intrusion detection 
system are disclosed. The method comprises maintaining a 
5 state table. The state table is indexed such that inputs 

comprising a current state and a current character yield 
an output of a new state. The new state is related to an 
indication of an attack on a computer network. The 
method further includes maintaining the current state . 
10 An input stream comprising a plurality of characters is 

received. A first character of the input stream is 
selected as the current character. The current character 
and the current state are compared to the state table to 
generate a new state. 

15 
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